-
Notifications
You must be signed in to change notification settings - Fork 0
Initialization hard fault memaccess #10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR fixes nondeterministic hard faults caused by uninitialized memory accesses in the BNO080 sensor module. The issue occurred when read() was called before the ISR had populated sensor data arrays, resulting in invalid pointer dereferences.
- Initializes all sensor data arrays (quaternion, accelerometer, gyroscope, magnetometer, gravity) to valid float objects (0.0) in the constructor
- Re-initializes these arrays in the reset function to handle device resets
- Adds defensive NULL checks in the read function to catch any remaining edge cases
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
||
| STATIC void bno080_init_data_arrays(bno080_BNO080_obj_t *self) { | ||
| for (int i = 0; i < QUAT_DIMENSION; i++) { | ||
| self->fquat[i] = mp_obj_new_float(0.0f); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn’t this create a memory leak? I might be wrong, but from what I recall, mp_obj_new_float allocates a new object on the heap. If self->fquat[i] is not NULL, the existing pointer will be overwritten and effectively lost. The garbage collector is probably helping with this but at the performance cost.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe you are correct. I first noticed this pattern in the report function when debugging, thinking that this could be a reason for the hard fault (see fn pasted below). But, after looking into it further, since self->fquat points to an mp_obj_t array and Python (and therefore MicroPython) floats are immutable, the struct has to be updated with a new float object pointer on each update. Besides, I haven't seen any problems with memory allocation with the current implementation.
I guess an alternative solution would be to try using the native float instead of an mp_obj_t array (since we don't access self.fquat from the Python layer anyways). I think this would probably be a separate PR though - let me know your thoughts.
STATIC void bno080_report_rotation(bno080_BNO080_obj_t *self, elapsed_t timestamp, const uint8_t *pkt, int len) {
/**
...
*/
uint8_t qp = 14; /// per section 6.5.19 Q Point = 14
// https://en.wikipedia.org/wiki/Q_(number_format)
float scale = pow(2.0, -qp);
self->fquat[0] = mp_obj_new_float(READ_LE(int16_t, &pkt[4]) * scale); // i
self->fquat[1] = mp_obj_new_float(READ_LE(int16_t, &pkt[6]) * scale); // j
self->fquat[2] = mp_obj_new_float(READ_LE(int16_t, &pkt[8]) * scale); // k
self->fquat[3] = mp_obj_new_float(READ_LE(int16_t, &pkt[10]) * scale); // real
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was curious whether making that change would improve performance so I tried it in this PR here as well. Haven't been able to test longer-term performance (which I assume is where we would see big changes), but the short term sampling rate is pretty much the same (and likely limited by the Python layer regardless). #11
I'm trying to stabilize the native bno implementation for circuitpython as part of the work to clean up the cionic-circuitpython repo. I was running into nondeterministic hard faults due to invalid memory accesses for a while so tried to dig in and identify the root cause. I found out that the cause for this is in this function:
and it seems to be that the fquat ptrs are sometimes uninitialized when the python layer calls
bno080.read()for the first time. This happens whenbno080_report_rotationhasn't been called yet (i.e. if the ISR has not picked up the channel report yet). I'm pretty confident that this is the problem but I wasn't as sure about the best fix. In my head the options were:Here, I implemented bullet 1 and 2 to be extra defensive. I initialized data pointers in the constructor and also check for invalid ptrs in the read function.
Note that this implementation will lead to some garbage values (0,0,0,0) for data streams in the case that a crash would have happened instead. A future implementation could be to return NULL instead of garbage values and update the python-side CionicOrientation class implementation to handle cases where bno08x.read() returns NULL, but I went with this because there are many examples in cionic-circuitpython in non-main branches which would also be affected by this change.